In this paper, we present a unified framework for decision making under uncertainty. Our framework is based
on the composite of two risk measures, where the inner risk measure accounts for the risk of decision given
the exact distribution of uncertain model parameters, and the outer risk measure quantifies the risk that
occurs when estimating the parameters of distribution. We show that the model is tractable under mild
conditions. The framework is a generalization of several existing models, including stochastic programming,
robust optimization, distributionally robust optimization, etc. Using this framework, we study a few new
models which imply probabilistic guarantees for solutions and yield less conservative results compared to
traditional models. Numerical experiments are performed on portfolio selection problems to demonstrate the
strength of our models.


1. Introduction 

In this paper, we consider a decision maker who wants to minimize an objective function
$H(\mathbf{x}, \boldsymbol{\xi})$, where $\mathbf{x} \in \mathbb{R}^n$ is the decision variable and
$\boldsymbol{\xi} \in \mathbb{R}^s$ is some uncertain/unknown parameter related to the model. For example,
in a newsvendor problem, $\mathbf{x}$ is the order amount of newspapers by a newsvendor, and
$\boldsymbol{\xi}$ is the uncertain future demand. Similarly, in an investment problem, $\mathbf{x}$ is the
portfolio chosen by a portfolio manager, and $\boldsymbol{\xi}$ is the unknown future returns of the
instruments. The existence of the uncertain parameters distinguishes the problem from ordinary optimization
problems and has led to several decision making paradigms.

One of the earliest attempts to deal with such decision making problems under uncertainty was
proposed by Dantzig (1955), where it was assumed that the distribution of $\boldsymbol{\xi}$ is known
exactly and the decision is chosen to minimize the expectation of $H(\mathbf{x}, \boldsymbol{\xi})$. Such an
approach is called stochastic programming. Another approach named robust optimization, initiated by
Soyster (1973), supposes that all possible values of $\boldsymbol{\xi}$ lie within an uncertainty set, and
the decision should be made to minimize the worst-case value of $H(\mathbf{x}, \boldsymbol{\xi})$.
Stochastic programming and robust optimization models can be viewed as two extremes in the spectrum of
available information in decision making under uncertainty. There are models in between these two extremes.
For example, distributionally robust models (Scarf et al. 1958, Dupačová 1987, Delage and Ye 2010, etc.)
take into account both the stochastic and the robust aspects, where the distribution of $\boldsymbol{\xi}$
is assumed to belong to a certain distribution set and the worst-case expectation of
$H(\mathbf{x}, \boldsymbol{\xi})$ is minimized. There are also various models which minimize certain risk
(Rockafellar and Uryasev 2000, etc.) or the worst-case risk (El Ghaoui et al. 2003, Gaivoronski and Pflug
2005, Zhu and Fukushima 2009, etc.) of $H(\mathbf{x}, \boldsymbol{\xi})$. We will present a more detailed
review of these models in Section 2. In addition to the study of individual models, several works put
forward more general models. For instance, Bertsimas and Brown (2009) and Natarajan et al. (2009) show that
uncertainty sets can be constructed according to decision maker's risk preference, Bertsimas et al. (2014)
propose a general framework for data-driven robust optimization, and Wiesemann et al. (2014) propose a
framework for distributionally robust models.

While some models have demonstrated their effectiveness in practice, several issues remain unaddressed
in the existing literature:

1. A unified framework that includes all the models above, namely, the stochastic programming model,
the robust optimization model, the distributionally robust model and worst-case risk models, is lacking.

2. Though risk measures have been imposed on the objective function to deal with parameter
uncertainty (Artzner et al. 1999, Bertsimas and Brown 2009, etc.), no attempt has been made to
impose a risk measure on the expectation or other functionals of the objective function with respect
to distributional uncertainty.

3. The Bayesian approach has not been fully considered when modeling decision making under
distributional uncertainty in the existing literature, despite its appropriateness for such problems.

The goal of our paper is to fill these gaps and build a unified modeling framework for decision
making under uncertainty. Our unified framework is based on a risk measure interpretation
for robustness and encompasses several popular models, such as stochastic programming, robust
optimization, distributionally robust optimization, worst-case risk models, etc. Specifically, we
minimize the composite of two risk measures, where the inner risk measure accounts for the risk of
decision given the exact distribution of parameters, and the outer risk measure quantifies the risk
that occurs when estimating the parameters of distribution. For the outer risk measure, we take
a Bayesian approach and consider the posterior distribution of distribution parameters. We show
that the composite of risk measures is convex as long as both the inner and the outer risk
measures are convex risk measures. We also use this framework to construct several new models which
have real world meanings and perform numerical tests on these models which demonstrate their
strength.

We summarize our contributions as follows: 

1. We propose a composite risk measure (CRM) framework for decision making under uncertainty
where the composite of two risk measures is minimized. It is a generalization of several
existing models. We show that the corresponding optimization problem is convex under mild
conditions.

2. We take a novel approach to deal with distributional uncertainty by making use of risk measures
and the Bayesian posterior distribution of distribution parameters.

3. Using the composite risk measure framework, we study a VaR-Expectation model, a CVaR-Expectation
model and a CVaR-CVaR model, investigating their tractability and probabilistic
guarantees. Numerical experiments show that these models can be solved efficiently and perform
well in portfolio selection problems.

The remainder of this paper is organized as follows. In Section 2, we briefly review the existing
models for decision making under uncertainty. In Section 3, we present our composite risk measure
framework and show that several existing models fall into the framework. Several new models
within the general framework are proposed in Section 4. Numerical results for these new models
are shown in Section 5.

Notations. Throughout the paper, the following notations will be used. Ordinary lower case
letters ($x, y, \ldots$) denote scalars, boldfaced lower case letters ($\mathbf{x}, \mathbf{y}, \ldots$)
denote vectors. Specifically, $\mathbf{x} \in \mathbb{R}^n$ denotes the decision variable and
$\boldsymbol{\xi} \in \mathbb{R}^s$ denotes the uncertain/unknown parameters, and
$H(\mathbf{x}, \boldsymbol{\xi})$ is the loss function under decision $\mathbf{x}$ and parameter
$\boldsymbol{\xi}$.

2. Review of Existing Models for Decision Making under Uncertainty 

Many models have been proposed in the literature to study decision making problems under
uncertainty. In this section, we provide a review of those existing models.


2.1. Stochastic Programming 

One of the most popular ways to solve decision making problems under uncertainty is through
stochastic programming. In stochastic programming models, one assumes that the full distributional
information of $\boldsymbol{\xi}$ is available. Then to choose an optimal decision variable $\mathbf{x}$,
one considers a certain functional of the random loss $H(\mathbf{x}, \boldsymbol{\xi})$. Examples of this
approach include:

Expectation optimization. Dantzig (1955) considers the case where the objective is to minimize
the expectation of the random loss $H(\mathbf{x}, \boldsymbol{\xi})$. Namely, the optimization problem is

$$\min_{\mathbf{x} \in \mathcal{X}} \ \mathbb{E}[H(\mathbf{x}, \boldsymbol{\xi})]. \tag{1}$$

There is a large literature that studies such optimization problems. We refer interested readers to
Shapiro et al. (2009) for a comprehensive review.

Value-at-Risk (VaR) optimization. VaR has enjoyed great popularity as a measure of downside
risk in the finance industry since its introduction in the 1990s (see RiskMetrics 1996). For fixed
$\mathbf{x}$, the $\delta$-VaR of $H(\mathbf{x}, \boldsymbol{\xi})$ is defined as the $\delta$ quantile of
the random loss $H(\mathbf{x}, \boldsymbol{\xi})$. Mathematically,

$$\mathrm{VaR}_\delta(H(\mathbf{x}, \boldsymbol{\xi})) = \inf \{ t \in \mathbb{R} \mid \mathbb{P}(H(\mathbf{x}, \boldsymbol{\xi}) > t) \le 1 - \delta \}. \tag{2}$$

The corresponding VaR optimization problem can be written as:

$$\begin{array}{ll} \min_{\mathbf{x}, t} & t \\ \text{s.t.} & \mathbb{P}(H(\mathbf{x}, \boldsymbol{\xi}) > t) \le 1 - \delta. \end{array} \tag{3}$$

The VaR optimization problem has been studied extensively in the literature, see, e.g., Kast et al.
(1998), Lucas and Klaassen (1998), etc. However, there are three main drawbacks of VaR. First,
it doesn't take into account the magnitude of loss beyond VaR, resulting in decision maker's
preference for taking 'excessive but remote' risks (Einhorn and Brown 2008). Second, it is not
subadditive (Artzner et al. 1999), meaning that the VaR of a combined portfolio can be larger than
the sum of the VaRs of its components, a property that is not desired in many applications. Lastly,
$\mathrm{VaR}_\delta(H(\mathbf{x}, \boldsymbol{\xi}))$ is usually not convex in $\mathbf{x}$ (Birge and
Louveaux 2011), making the optimization problem intractable.

Conditional Value-at-Risk (CVaR) optimization. To overcome the drawbacks of VaR,
researchers further proposed a modified version of VaR, the CVaR (also called the expected
shortfall in some literature). For a certain random loss $X$, the $\delta$-CVaR is the expected loss in
the worst $1 - \delta$ cases. Mathematically, it can be written as:

$$\mathrm{CVaR}_\delta(X) = \frac{1}{1 - \delta} \int_\delta^1 \mathrm{VaR}_s(X) \, ds.$$

For atomless distributions, CVaR is equivalent to the conditional expectation of the loss beyond
VaR, namely,

$$\mathrm{CVaR}_\delta(X) = \mathbb{E}[X \mid X \ge \mathrm{VaR}_\delta(X)].$$

Rockafellar and Uryasev (2000) show that CVaR can be obtained by solving the following convex
program:

$$\min_{\alpha \in \mathbb{R}} \ \alpha + \frac{1}{1 - \delta} \mathbb{E}[(X - \alpha)^+], \tag{4}$$

which leads to another definition of CVaR. This formulation brings computational convenience as
the objective function is explicit and convex in $\alpha$.
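As a small illustration of the definitions above, the following Python sketch computes the empirical $\delta$-VaR and $\delta$-CVaR of a finite sample of losses and evaluates the Rockafellar and Uryasev objective. The function names `var`, `cvar` and `ru_objective` and the ten-point sample are illustrative assumptions, not part of the original text; the sample is treated as a uniform discrete distribution.

```python
import math


def var(losses, delta):
    """Empirical delta-VaR: smallest sample value t with P(loss <= t) >= delta."""
    xs = sorted(losses)
    # Rank of the delta-quantile; the small offset guards against float round-up.
    k = max(1, math.ceil(delta * len(xs) - 1e-12))
    return xs[k - 1]


def ru_objective(losses, delta, alpha):
    """Rockafellar-Uryasev objective: alpha + E[(X - alpha)^+] / (1 - delta)."""
    excess = sum(max(x - alpha, 0.0) for x in losses) / len(losses)
    return alpha + excess / (1.0 - delta)


def cvar(losses, delta):
    """Empirical delta-CVaR: the objective above evaluated at alpha = VaR."""
    return ru_objective(losses, delta, var(losses, delta))


# Hypothetical sample: losses 1, 2, ..., 10, each with probability 1/10.
losses = [float(v) for v in range(1, 11)]
print(var(losses, 0.8))
print(cvar(losses, 0.8))
```

For this sample the 0.8-VaR is 8 and the 0.8-CVaR is about 9.5, the average of the two largest losses. Evaluating `ru_objective` at other values of `alpha` never gives a smaller value, which is the sense in which CVaR is the optimal value of the convex program.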


2.2. Robust Optimization 

Another popular way for decision making under uncertainty is to use a robust optimization
approach. In the robust optimization approach, instead of assuming the distributional knowledge
of $\boldsymbol{\xi}$, one only assumes that $\boldsymbol{\xi}$ takes values in a certain uncertainty set
$\Xi$. Then, when choosing the decision variable, one considers the worst-case outcome associated with each
decision, where the worst-case scenario is chosen from the specified uncertainty set. The optimization
problem can be written as follows:

$$\min_{\mathbf{x} \in \mathcal{X}} \max_{\boldsymbol{\xi} \in \Xi} \ H(\mathbf{x}, \boldsymbol{\xi}). \tag{5}$$

Robust optimization problems have more general forms where constraints instead of the objective
function are affected by parameter uncertainty (e.g. Bertsimas et al. 2011). Robust optimization
was first proposed by Soyster (1973) and has attracted much attention in the past few decades.
For a comprehensive review of the literature, we refer readers to Ben-Tal et al. (2009) and
Bertsimas et al. (2011).

Choosing a suitable uncertainty set is essential in formulating a robust optimization problem.
Two main issues should be taken into consideration when designing uncertainty sets:

Tractability. Only a few uncertainty sets will lead to tractable counterparts for the original
problem. Some known cases include polyhedral uncertainty set (Ben-Tal and Nemirovski 1999),
ellipsoidal uncertainty set (Ben-Tal and Nemirovski 1999, El Ghaoui and Lebret 1997 and
El Ghaoui et al. 1998), norm uncertainty set (Bertsimas et al. 2004), etc.

Conservativeness. Intuitively, when the uncertainty set is very large, the resulting decision will
be very conservative. For the choice of uncertainty set, see Bertsimas and Sim (2004),
Chen et al. (2007) and Bertsimas et al.


2.3. Distributionally Robust Models 

In practice, it is often the case that one only has partial information about the distribution of
$\boldsymbol{\xi}$, such as first and second moments. Applying stochastic programming in those cases is
not feasible. Meanwhile, using a robust optimization approach is not straightforward either and often
results in overly conservative solutions. As a result, an intermediate path has been proposed which is
called the distributionally robust optimization (DRO) model. In the DRO models, the decision maker
constructs an uncertainty set $\mathcal{F}$ of the underlying distribution and minimizes the expected
value of the objective function under the worst-case distribution chosen from $\mathcal{F}$. That is, one
considers the following problem:

$$\min_{\mathbf{x} \in \mathcal{X}} \max_{F \in \mathcal{F}} \ \mathbb{E}_{\boldsymbol{\xi} \sim F}[H(\mathbf{x}, \boldsymbol{\xi})], \tag{6}$$

where $\mathbb{E}_{\boldsymbol{\xi} \sim F}[H(\mathbf{x}, \boldsymbol{\xi})]$ denotes the expectation of
$H(\mathbf{x}, \boldsymbol{\xi})$ when $F$ is the distribution of $\boldsymbol{\xi}$.

The distributionally robust model was first proposed in Scarf et al. (1958). After its introduction,
several different choices of the uncertainty set $\mathcal{F}$ have been proposed. We will review the
two most discussed types of uncertainty sets below and refer readers to Wiesemann et al. (2014) and
references therein for other types of uncertainty sets.


Distributions with partial knowledge on moments. Scarf et al. (1958), Dupačová (1987),
Prékopa (1995), Bertsimas and Popescu (2005), etc. consider a family of distributions with known
moments. Specifically, Scarf et al. (1958) consider the following optimization problem (in the
context of an inventory problem):

$$\min_{x \in \mathbb{R}} \max_{F \in \mathcal{F}(\mu_0, \sigma_0)} \ \mathbb{E}_{\xi \sim F}[c(x - \xi)^+ + r(\xi - x)^+],$$

$$\text{where } \mathcal{F}(\mu_0, \sigma_0) = \left\{ F \in \mathcal{M} \;\middle|\; \begin{array}{l} \mathbb{P}(\xi \in \mathbb{R}_+) = 1 \\ \mathbb{E}_{\xi \sim F}[\xi] = \mu_0 \\ \mathbb{E}_{\xi \sim F}[(\xi - \mu_0)^2] = \sigma_0^2 \end{array} \right\}. \tag{7}$$

In (7), $\mathcal{M}$ is the set of all probability measures in the probability space where $\xi$ is
defined and $c$ and $r$ are given parameters. Scarf et al. (1958) show that the worst-case distribution
is a two-point distribution with given decision $x$ and derive a closed-form solution to (7).


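Delage and Ye (2010) consider a more general form of moment constraints. Namely, the uncertainty
set is constructed as:

$$\mathcal{F}(\Xi, \boldsymbol{\mu}_0, \Sigma_0, \gamma_1, \gamma_2) = \left\{ F \in \mathcal{M} \;\middle|\; \begin{array}{l} \mathbb{P}(\boldsymbol{\xi} \in \Xi) = 1 \\ (\mathbb{E}_{\boldsymbol{\xi} \sim F}[\boldsymbol{\xi}] - \boldsymbol{\mu}_0)^T \Sigma_0^{-1} (\mathbb{E}_{\boldsymbol{\xi} \sim F}[\boldsymbol{\xi}] - \boldsymbol{\mu}_0) \le \gamma_1 \\ \mathbb{E}_{\boldsymbol{\xi} \sim F}[(\boldsymbol{\xi} - \boldsymbol{\mu}_0)(\boldsymbol{\xi} - \boldsymbol{\mu}_0)^T] \preceq \gamma_2 \Sigma_0 \end{array} \right\}. \tag{8}$$

Delage and Ye (2010) show that the problem with uncertainty set (8) can be solved as a convex
optimization problem. They also provide a data-driven method for choosing the parameters $\gamma_1$ and
$\gamma_2$.

More recently, Wiesemann et al. (2014) consider a general model which incorporates both
moment information and support information. The distributional uncertainty set considered is:

$$\mathcal{F} = \left\{ F \in \mathcal{M}(\mathbb{R}^s \times \mathbb{R}^m) \;\middle|\; \begin{array}{l} \mathbb{E}_F[A\boldsymbol{\xi} + B\boldsymbol{\eta}] = \mathbf{b} \\ \mathbb{P}_F[(\boldsymbol{\xi}, \boldsymbol{\eta}) \in C_i] \in [\underline{p}_i, \overline{p}_i], \ \forall i \in \mathcal{I} \end{array} \right\}, \tag{9}$$

where $\mathcal{M}(\mathbb{R}^s \times \mathbb{R}^m)$ represents probability distributions on
$\mathbb{R}^s \times \mathbb{R}^m$, $\boldsymbol{\xi} \in \mathbb{R}^s$ is the random term,
$\boldsymbol{\eta} \in \mathbb{R}^m$ is an auxiliary random vector, $A \in \mathbb{R}^{k \times s}$,
$B \in \mathbb{R}^{k \times m}$, $\mathbf{b} \in \mathbb{R}^k$ are predetermined parameters
and $C_i$ are conic confidence sets. It is shown that when certain conditions are satisfied, model (6)
with distributional uncertainty set (9) can be solved by conic programming.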


Distributions related to a given distribution. Ben-Tal et al. (2013) consider a set of
distributions that arise from $\phi$-divergence. Namely,

$$\mathcal{F} = \left\{ \mathbf{f} \in \mathbb{R}^m \;\middle|\; I_\phi(\mathbf{f}, \hat{\mathbf{f}}) \le \rho, \ \sum_{i=1}^m f_i = 1, \ f_i \ge 0, \ i = 1, 2, \ldots, m \right\}, \tag{10}$$

where $\hat{\mathbf{f}}$ is a given probability vector (e.g., the empirical distribution vector) and
$I_\phi(\mathbf{f}, \hat{\mathbf{f}})$ is the $\phi$-divergence between two probability vectors
$\mathbf{f}$ and $\hat{\mathbf{f}}$ (Pardo 2005). Ben-Tal et al. (2013) show that
the robust linear constraint

$$(\mathbf{a} + \mathbb{E}_{\boldsymbol{\xi} \sim F}(\boldsymbol{\xi}))^T \mathbf{x} \le \beta, \quad \forall F \in \mathcal{F},$$

can be written equivalently as a linear constraint, a conic constraint, or a convex constraint,
depending on the choice of $\phi$, where $\mathbf{a} \in \mathbb{R}^n$ and $\beta \in \mathbb{R}$ are
fixed parameters, $\mathbf{x} \in \mathbb{R}^n$ is the decision variable and $\mathcal{F}$ is defined in
(10). Similar approaches have been taken by Wang et al. (2013) and Klabjan et al. (2013).
Bertsimas et al. (2014) propose a model where the distributional uncertainty set is constructed
by means of a hypothesis test given a set of available data. Namely, two hypotheses are compared:

$$H_0: \mathbb{P}^* = \mathbb{P}_0 \quad \text{vs.} \quad H_A: \mathbb{P}^* \ne \mathbb{P}_0,$$

where $H_0$ is the null hypothesis, $H_A$ is the alternative hypothesis and $\mathbb{P}_0$ is an arbitrary
distribution. By specifying a hypothesis test, e.g. $\chi^2$-test, G-test, etc., and a confidence level
$\epsilon$, one can construct a distribution set $\mathcal{F}$ containing all the distributions that pass
the test under the given set of data.


2.4. Other Choices of Objectives 


In addition to the models mentioned above, there are several other models for decision making
under uncertainty that have been studied in the literature.

Minimizing worst-case VaR. El Ghaoui et al. (2003) consider the problem of minimizing
VaR (defined in (2)) over a portfolio of random loss where only partial knowledge about the
distribution $F$ of $\boldsymbol{\xi}$ is known. Mathematically, the optimization problem is:

$$\begin{array}{ll} \min_{\mathbf{x}, t} & t \\ \text{s.t.} & \mathbb{P}_{\boldsymbol{\xi} \sim F}(H(\mathbf{x}, \boldsymbol{\xi}) > t) \le 1 - \delta, \quad \forall F \in \mathcal{F}, \\ & \mathbf{x} \in \mathcal{X}. \end{array} \tag{11}$$

Using the exact Chebyshev bound given in Bertsimas and Popescu (2005), El Ghaoui et al. (2003)
show that problem (11) can be reduced to a second-order cone program (SOCP). El Ghaoui et al.
(2003) further show that if $\mathcal{F}$ is a set of distributions with the first and second moments
$(\boldsymbol{\mu}, \Sigma)$ satisfying
$(\boldsymbol{\mu}, \Sigma) \in \mathrm{conv}\{(\boldsymbol{\mu}_1, \Sigma_1), (\boldsymbol{\mu}_2, \Sigma_2), \ldots, (\boldsymbol{\mu}_k, \Sigma_k)\}$
or $(\boldsymbol{\mu}, \Sigma)$ is bounded in a componentwise fashion, problem (11) can be solved
by an SOCP or a semi-definite program (SDP) respectively.

Minimizing worst-case CVaR. Zhu and Fukushima (2009) solve the problem of optimizing
the CVaR (defined in (4)) of a portfolio when the distribution $F$ of the random return
$\boldsymbol{\xi}$ is only known to belong to an uncertainty set $\mathcal{F}$ instead of being exactly
known, namely,

$$\min_{\mathbf{x} \in \mathcal{X}} \max_{F \in \mathcal{F}} \ \mathrm{CVaR}_{F,\delta}[H(\mathbf{x}, \boldsymbol{\xi})], \tag{12}$$

where $\mathrm{CVaR}_{F,\delta}(\cdot)$ denotes the $\delta$-CVaR of a random variable whose distribution
is $F$. Zhu and Fukushima (2009) show that for certain forms of $\mathcal{F}$, problem (12) can be
transformed to a convex optimization problem.

3. A Composite Risk Measure Framework 


In this section, we present a unified framework for decision making problems under uncertainty.
Our framework encompasses all the decision paradigms discussed in Section 2 and can be used to
generate new ones.

The idea of our framework is based on risk measures defined as follows:

Definition 1. Let $\mathcal{L}$ be a set of random variables defined on the sample space $\Omega$. A
functional $\rho(\cdot): \mathcal{L} \to \mathbb{R}$ is a risk measure if it satisfies the following
properties:

1. Monotonicity: For any $X, Y \in \mathcal{L}$, if $X \ge Y$, then $\rho(X) \ge \rho(Y)$, where
$X \ge Y$ means that $X(\omega) \ge Y(\omega)$ for any $\omega \in \Omega$.

2. Translation invariance: For any $X \in \mathcal{L}$ and $c \in \mathbb{R}$, $\rho(X + c) = \rho(X) + c$.

In addition, Artzner et al. (1999) define a subset of risk measures satisfying some structural
properties presented as follows.

Definition 2. If a risk measure $\rho(\cdot)$ satisfies the following properties:

1. Convexity: For any $X, Y \in \mathcal{L}$ and $\lambda \in [0, 1]$,
$\rho(\lambda X + (1 - \lambda) Y) \le \lambda \rho(X) + (1 - \lambda) \rho(Y)$;

2. Positive homogeneity: For any $X \in \mathcal{L}$ and $\lambda \ge 0$, $\rho(\lambda X) = \lambda \rho(X)$,

then $\rho(\cdot)$ is called a coherent risk measure. If only convexity holds, it is called a convex risk
measure (Föllmer and Schied 2002).


To establish our framework, first we note that given $\mathbf{x}$, $H(\mathbf{x}, \boldsymbol{\xi})$ is a
random variable defined on the sample space of $\boldsymbol{\xi}$ (which we denote by $\Omega_0$). We
denote such a random variable by $Y(\mathbf{x})$. Define $g_F(\cdot)$ to be a risk measure for
$Y(\mathbf{x})$, where the subscript $F$ is used to show the dependence of this risk measure on the choice
of distribution $F$. Now we further define a measurable space for $F$: $(\Omega_1, \Sigma_1)$, where
$\Omega_1$ denotes the space of all the distribution functions for $\boldsymbol{\xi}$ and $\Sigma_1$ is a
$\sigma$-algebra defined on the space of such distributions. Moreover, we can define a measure
$\mathbb{P}_1$ on such a space using concepts from Bayesian statistics (see the following passage for
detailed discussions). With this definition, the risk measure $g_F(Y(\mathbf{x}))$ can be viewed as a
random variable too in the following way:

$$Z(\mathbf{x}): F \in \Omega_1 \mapsto g_F(Y(\mathbf{x})) \in \mathbb{R}. \tag{13}$$

We denote the linear space of $Z(\mathbf{x})$ by $\mathcal{Z}$. Finally, since $Z(\mathbf{x})$ is a random
variable, we can apply another risk measure $\mu: \mathcal{Z} \to \mathbb{R}$ and consider the following
optimization problem:

$$\min_{\mathbf{x} \in \mathcal{X}} \ \mu(Z(\mathbf{x})). \tag{14}$$

Therefore, our framework can be written as follows:

$$\min_{\mathbf{x} \in \mathcal{X}} \ \mu(g_F(H(\mathbf{x}, \boldsymbol{\xi}))). \tag{15}$$

We call this the composite-risk-measure (CRM) framework for decision making under uncertainty.
In the following discussions, we will refer to $g_F(\cdot)$ as the inner risk measure and $\mu(\cdot)$ as
the outer risk measure. We first present the following tractability result for problem (15).
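Proposition 1. Optimization problem (15) is a convex optimization problem if the following
holds:

1. $H(\mathbf{x}, \boldsymbol{\xi})$ is convex in $\mathbf{x}$;

2. $\mathcal{X}$ is convex;

3. $\mu(\cdot)$ is a convex risk measure;

4. $g_F(\cdot)$ is a convex risk measure for each $F \in \mathcal{F}$.

Proof. We show that $\mu(g_F(H(\mathbf{x}, \boldsymbol{\xi})))$ is a convex function of $\mathbf{x}$. For
any $\mathbf{x}, \mathbf{y} \in \mathcal{X}$ and $0 \le \lambda \le 1$, we have:

$$\begin{array}{rcl} \lambda \mu(g_F(H(\mathbf{x}, \boldsymbol{\xi}))) + (1 - \lambda) \mu(g_F(H(\mathbf{y}, \boldsymbol{\xi}))) & \ge & \mu(\lambda g_F(H(\mathbf{x}, \boldsymbol{\xi})) + (1 - \lambda) g_F(H(\mathbf{y}, \boldsymbol{\xi}))) \\ & \ge & \mu(g_F(\lambda H(\mathbf{x}, \boldsymbol{\xi}) + (1 - \lambda) H(\mathbf{y}, \boldsymbol{\xi}))) \\ & \ge & \mu(g_F(H(\lambda \mathbf{x} + (1 - \lambda) \mathbf{y}, \boldsymbol{\xi}))), \end{array}$$

where the first line follows from the convexity of $\mu(\cdot)$; the second line follows from the convexity
of $g_F(\cdot)$ and the monotonicity of $\mu(\cdot)$; the third line follows from the convexity of
$H(\mathbf{x}, \boldsymbol{\xi})$ and the monotonicity of $g_F(\cdot)$ and $\mu(\cdot)$. $\square$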
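To make the composite objective in (15) concrete, the following Python sketch evaluates it for a finite number of scenarios and a finite number of candidate distributions. Everything specific here is an illustrative assumption rather than part of the original text: the helper names, the newsvendor-type loss with unit costs 1 and 3, the three demand scenarios, and the two candidate distributions with equal posterior weight.

```python
def expectation(values, probs):
    """Expected value of a discrete random variable."""
    return sum(v * p for v, p in zip(values, probs))


def cvar(values, probs, delta):
    """delta-CVaR of a discrete loss: mean over the worst (1 - delta) probability mass."""
    tail = 1.0 - delta
    remaining, total = tail, 0.0
    for v, p in sorted(zip(values, probs), reverse=True):
        take = min(p, remaining)
        total += take * v
        remaining -= take
        if remaining <= 1e-12:
            break
    return total / tail


def worst_case(values, probs):
    """Largest value that has positive probability."""
    return max(v for v, p in zip(values, probs) if p > 0)


def composite_risk(x, loss, scenarios, candidates, posterior, inner, outer):
    """Evaluate outer(g_F(H(x, xi))) where F ranges over the candidate distributions."""
    losses = [loss(x, xi) for xi in scenarios]
    z = [inner(losses, probs) for probs in candidates]  # Z(x): one value per F
    return outer(z, posterior)


def newsvendor_loss(x, xi):
    # Hypothetical loss: overage cost 1 per unit, underage cost 3 per unit.
    return 1.0 * max(x - xi, 0.0) + 3.0 * max(xi - x, 0.0)


scenarios = [0.0, 10.0, 20.0]                        # possible demand values
candidates = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]      # two candidate distributions F
posterior = [0.5, 0.5]                               # weights over the candidates

exp_exp = composite_risk(10.0, newsvendor_loss, scenarios, candidates,
                         posterior, expectation, expectation)
wc_exp = composite_risk(10.0, newsvendor_loss, scenarios, candidates,
                        posterior, expectation, worst_case)
cvar_exp = composite_risk(10.0, newsvendor_loss, scenarios, candidates, posterior,
                          expectation, lambda z, p: cvar(z, p, 0.5))
print(exp_exp, wc_exp, cvar_exp)
```

With an expectation as the inner measure, switching the outer measure from expectation to worst case moves the objective at $\mathbf{x} = 10$ from about 14 to about 17; the outer CVaR at level 0.5 coincides with the worst case here only because there are just two equally weighted candidates.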

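Now we turn to the distribution $\mathbb{P}_1$ over $\Omega_1$. We make the following assumption in our
discussion:

Assumption 1. $\Omega_1$ is parameterized by a finite number of parameters.

In fact, Assumption 1 does not cause much loss of generality. Many distribution families we are
interested in are parameterized by a finite number of parameters. For example, if $\mathcal{F}$ is the
family of discrete distributions whose probability mass function is
$\mathbb{P}(\boldsymbol{\xi} = \boldsymbol{\xi}_i) = p_i$, $i = 1, \ldots, m$, then

$$\Omega_1 = \left\{ (p_1, p_2, \ldots, p_m) \;\middle|\; \sum_{i=1}^m p_i = 1; \ p_i \ge 0, \ \forall i = 1, 2, \ldots, m \right\}.$$

For the family of multivariate normal distributions:

$$dF(\boldsymbol{\xi}) = \frac{1}{\sqrt{(2\pi)^s |\Sigma|}} \exp\left(-\frac{1}{2} (\boldsymbol{\xi} - \boldsymbol{\mu})^T \Sigma^{-1} (\boldsymbol{\xi} - \boldsymbol{\mu})\right) d\boldsymbol{\xi},$$

we have

$$\Omega_1 = \left\{ (\boldsymbol{\mu}, \Sigma) \;\middle|\; \boldsymbol{\mu} \in \mathbb{R}^s, \ \Sigma \in S_+^s \right\},$$

where $S_+^s$ is the cone of positive semi-definite matrices. For the family of mixture distributions
$dF(\boldsymbol{\xi}) = \sum_{i=1}^m \lambda_i \, dF_i(\boldsymbol{\xi})$ with $\sum_{i=1}^m \lambda_i = 1$,
$\lambda_i \ge 0$, $i = 1, 2, \ldots, m$, where $F_i$, $i = 1, 2, \ldots, m$, are predetermined
distributions, we have

$$\Omega_1 = \left\{ (\lambda_1, \lambda_2, \ldots, \lambda_m) \;\middle|\; \sum_{i=1}^m \lambda_i = 1; \ \lambda_i \ge 0, \ \forall i = 1, 2, \ldots, m \right\}.$$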

In the framework of Bayesian statistics, observations of $\boldsymbol{\xi}$ are treated as given rather
than samples randomly drawn from an underlying distribution. Meanwhile, the parameters of the
distribution are handled as random variables whose distribution reflects how likely each parameter value
is given the observations. We denote the distribution parameters as
$\Theta = (\theta_1, \ldots, \theta_m) \in \Omega_1$. To obtain the distribution of $\Theta$,
one should first specify a prior distribution $f(\Theta)$ which expresses one's belief about $\Theta$ when
no observation is available. The prior distribution can be either informative or uninformative. Then, when
observed data $\Xi = (\boldsymbol{\xi}_1, \ldots, \boldsymbol{\xi}_N)$ are collected, one can derive the
posterior distribution $p(\Theta \mid \Xi)$ using Bayes' formula:

$$p(\Theta \mid \Xi) = \frac{f(\Theta) \prod_{i=1}^N L(\boldsymbol{\xi}_i \mid \Theta)}{\int_{\Omega_1} f(\Theta) \prod_{i=1}^N L(\boldsymbol{\xi}_i \mid \Theta) \, d\Theta}, \tag{16}$$

where $p(\Theta \mid \Xi)$ is the posterior distribution given data $\Xi$,
$\prod_{i=1}^N L(\boldsymbol{\xi}_i \mid \Theta)$ is the likelihood function
and $\int_{\Omega_1} f(\Theta) \prod_{i=1}^N L(\boldsymbol{\xi}_i \mid \Theta) \, d\Theta$ is the
normalizing factor. Note that the integration in (16) should be replaced by summation when a discrete
distribution instead of a continuous distribution is considered. In practice, if one wants to sample from
the posterior distribution in (16) with the Metropolis-Hastings algorithm, one only needs to be able to
compute the numerator, which is usually easy to do. In this way, the distribution $\mathbb{P}_1$ over
$\Omega_1$ can be defined and sampled from easily, and it is often the case that such definitions are
data-driven.
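The remark that only the numerator of Bayes' formula is needed can be illustrated with a minimal random-walk Metropolis-Hastings sketch in Python. The model is an assumption chosen for simplicity and does not come from the original text: Bernoulli observations with success probability $\theta$ and a Beta($a$, $b$) prior, so that the sampler output can be compared with the known Beta posterior. The function names, step size, burn-in length and data are likewise hypothetical.

```python
import math
import random


def log_unnormalized_posterior(theta, data, a=1.0, b=1.0):
    """log of prior(theta) * likelihood(data | theta), up to an additive constant."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    k, n = sum(data), len(data)
    return (a - 1 + k) * math.log(theta) + (b - 1 + n - k) * math.log(1.0 - theta)


def metropolis_hastings(log_target, start, n_samples, step=0.2, burn_in=1000, seed=0):
    """Random-walk Metropolis-Hastings; needs the target only up to a constant."""
    rng = random.Random(seed)
    current, current_lp = start, log_target(start)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            samples.append(current)
    return samples


# Hypothetical data: 7 successes in 10 trials, uniform Beta(1, 1) prior.
data = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1]
samples = metropolis_hastings(lambda th: log_unnormalized_posterior(th, data), 0.5, 20000)
print(sum(samples) / len(samples))
```

Under these assumptions the exact posterior is Beta(8, 4) with mean 2/3, so the printed sample mean should be close to that value. The normalizing integral is never computed: it cancels in the acceptance ratio.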
3.1. Relation to the Models in Section 2

In the following, we show that all the optimization models discussed in Section 2 can be viewed as
special cases of our proposed composite risk measure framework.

Stochastic programming. In stochastic programming models, we assume that we know the
distribution $F_0$ exactly, namely, $\mathcal{F} = \{F_0\}$. Therefore, models (1), (3) and (4) can be
viewed as the outer risk measure taken to be a singleton:

$$\mu(g_F(\cdot)) = g_{F_0}(\cdot),$$

and the inner risk measures are chosen to be expectation, VaR and CVaR, respectively.

Distributionally robust optimization. In the distributionally robust optimization models,
the inner measure is chosen to be the expectation measure, while the outer measure can be viewed
as the worst-case risk measure:

$$\mathrm{WC}(Z) = \inf \{ \alpha \mid \mathbb{P}(Z \le \alpha) = 1 \}, \tag{17}$$

where $Z$ is a random variable defined on $(\Omega_1, \Sigma_1, \mathbb{P}_1)$ with $\Omega_1$ chosen to be
the distribution set $\mathcal{F}$. In fact, the distributionally robust model also covers the
singleton-coherent risk measure model:

$$\min_{\mathbf{x} \in \mathcal{X}} \ \mu(Y(\mathbf{x})).$$

Artzner et al. (1999) show that coherent risk measures are closely related to worst-case risk
measures. The exact relationship is given in the following theorem (we use $\ll$ to denote absolute
continuity between probability measures).


Theorem 1. Let $\mathcal{L}$ be any linear space of random variables defined on a probability space
$(\Omega, \Sigma, \mathbb{P})$. A functional $\rho(\cdot): \mathcal{L} \to \mathbb{R}$ is a coherent risk
measure if and only if there exists a family of probability measures $\mathcal{Q}$ with
$Q \ll \mathbb{P}$ for all $Q \in \mathcal{Q}$ such that

$$\rho(X) = \sup_{Q \in \mathcal{Q}} \mathbb{E}_Q(X), \quad \forall X \in \mathcal{L},$$

where $\mathbb{E}_Q(X)$ denotes the expectation of the random variable $X$ under the measure $Q$ (as
opposed to the measure of $X$ itself).

For example, for $\mathrm{CVaR}_\delta(\cdot)$, the corresponding set of distributions is
$\mathcal{Q} = \{ Q \ll \mathbb{P} \mid dQ/d\mathbb{P} \le (1 - \delta)^{-1} \}$. The above theorem shows
that coherent risk measures can be represented by the worst-case expectations taken over a set of
probability distributions. The proof of Theorem 1 in fact predates the introduction of coherent risk
measures, see Huber (1981). This result has been used in several recent works that study
distributionally robust optimization, see, e.g., Bertsimas and Brown (2009) and Natarajan et al. (2009).

Robust optimization. By using the distribution set $\mathcal{F}$ where each $F \in \mathcal{F}$ is a
distribution putting all its weight on one point $\boldsymbol{\xi} \in \Xi$, the distributionally robust
model (6) reduces to a robust optimization model (5) which falls into our framework.

Worst-Case CVaR and VaR optimization. Comparing models (11) and (12) to the unified
model (15), we see that the corresponding inner risk measures are VaR and CVaR, respectively,
while the outer risk measure is the worst-case risk measure.

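By choosing different combinations of outer and inner risk measures, one can come up with more
optimization models. However, some of those models reduce to the models above after
transformation. Some examples of these models are presented as follows.

Two-fold expectation as expectation. Choosing outer risk measure $\mu(\cdot)$ and inner risk
measure $g_F(\cdot)$ both as expectations, we obtain the following optimization model:

$$\min_{\mathbf{x} \in \mathcal{X}} \ \mathbb{E}(\mathbb{E}_{\boldsymbol{\xi} \sim F}[H(\mathbf{x}, \boldsymbol{\xi})]). \tag{18}$$

Denoting the cumulative distribution function of the random term $\boldsymbol{\xi}$ by
$F_\Theta(\boldsymbol{\xi})$, we have

$$\begin{array}{rcl} \mathbb{E}(\mathbb{E}_{\boldsymbol{\xi} \sim F}[H(\mathbf{x}, \boldsymbol{\xi})]) & = & \displaystyle\int_{\Omega_1} \int_{\mathbb{R}^s} H(\mathbf{x}, \boldsymbol{\xi}) \, dF_\Theta(\boldsymbol{\xi}) \, p(\Theta) \, d\Theta \\ & = & \displaystyle\int_{\mathbb{R}^s} H(\mathbf{x}, \boldsymbol{\xi}) \left[ \int_{\Omega_1} p(\Theta) \, dF_\Theta(\boldsymbol{\xi}) \, d\Theta \right] \\ & = & \displaystyle\int_{\mathbb{R}^s} H(\mathbf{x}, \boldsymbol{\xi}) \, dF_E(\boldsymbol{\xi}) \\ & = & \mathbb{E}[H(\mathbf{x}, \boldsymbol{\xi})], \end{array}$$

where $dF_E = \int_{\Omega_1} p(\Theta) \, dF_\Theta \, d\Theta$ can be viewed as the expected measure
(e.g., if $\Omega_1$ contains only continuous distributions, then this is just a weighted average over all
the density functions). Therefore, problem (18) is equivalent to the expectation optimization problem:

$$\min_{\mathbf{x} \in \mathcal{X}} \ \mathbb{E}[H(\mathbf{x}, \boldsymbol{\xi})].$$

Two-fold worst-case as worst-case. Choosing outer risk measure $\mu(\cdot)$ and inner risk measure
$g_F(\cdot)$ both as worst-case risk measures, we obtain the following optimization model:

$$\min_{\mathbf{x} \in \mathcal{X}} \max_{F \in \mathcal{F}} \max_{\boldsymbol{\xi} \in \Xi_F} \ H(\mathbf{x}, \boldsymbol{\xi}), \tag{19}$$

where $\Xi_F$ is the uncertainty set for $\boldsymbol{\xi}$ when the distribution of $\boldsymbol{\xi}$ is
$F$. This problem can be reduced to model (5):

$$\min_{\mathbf{x} \in \mathcal{X}} \max_{\boldsymbol{\xi} \in \Xi} \ H(\mathbf{x}, \boldsymbol{\xi}), \tag{20}$$

where $\Xi = \{ \boldsymbol{\xi} : \exists F_0 \in \mathcal{F}, \ \boldsymbol{\xi} \in \Xi_{F_0} \}$. An
example of this model can be found in Bertsimas et al. (2014).

All the cases discussed in this section are summarized in Table 1, where the symbol ~ means that
the model is equivalent to another model labeled by the number, and the symbol × means that such a
combination of risk measures has not been considered yet.

Table 1 Different composites of risk measures

                     Inner risk measure g_F(·)
Outer μ(·)      Expectation    CVaR     VaR      Worst-Case
Singleton       (1)            (4)      (3)      (5)
Expectation     ~(1)           ×        ×        ×
VaR             ×              ×        ×        ×
Worst-Case      (6)            (12)     (11)     ~(5)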

4. Constructing New Models 


In this section, we use our framework to propose and study a few new paradigms for decision
making under uncertainty. In the following, we continue to use the notation $\Omega_1$ to denote the
sample space of $F$ and $\mathbb{P}_1$ to denote the probability distribution over $\Omega_1$. As we have
discussed earlier, $\mathbb{P}_1$ can be derived as a posterior distribution from a Bayesian approach.

4.1. Minimizing VaR-Expectation

When minimizing the expectation of random loss $H(\mathbf{x}, \boldsymbol{\xi})$ under distributional
uncertainty, we can choose the outer risk measure as $\delta$-VaR to provide a probabilistic guarantee:

$$\min_{\mathbf{x} \in \mathcal{X}} \ \mathrm{VaR}_\delta(\mathbb{E}_{\boldsymbol{\xi} \sim F}[H(\mathbf{x}, \boldsymbol{\xi})]), \tag{21}$$

where $\mathcal{X}$ is the feasible set of $\mathbf{x}$. This model can be interpreted as finding a
decision variable to minimize the threshold such that the chance that the expected loss exceeds the
threshold is small, or in other words, this model can be viewed as minimizing the upper bound for the
one-sided $\delta$-confidence interval for the expected loss. It could be applicable in the context where
the expected value is a common criterion to evaluate the loss while there is uncertainty about the
underlying distribution.

Note that model (21) shares a similar spirit with the distributionally robust model (6). Both models
are designed to deal with parameter uncertainty. However, in distributionally robust models such as
Delage and Ye (2010) and Bertsimas et al. (2014), it is assumed that there exists a true underlying
distribution $F$ of $\boldsymbol{\xi}$ and the distribution set $\mathcal{F}$ is chosen as a confidence
region of $F$ to hedge against uncertainty. Moreover, the distribution set does not depend on the decision
$\mathbf{x}$. In contrast, in (21), the distribution set (for the VaR) is dependent on $\mathbf{x}$. This
is often desirable since for different $\mathbf{x}$, the objective function
$g_F(H(\mathbf{x}, \boldsymbol{\xi}))$ may have different properties, thus the set of unfavorable
distributions may differ. As a result, solving problem (21) leads to a less conservative solution
under the same robustness level. To illustrate, we denote the optimal solution, the optimal value
and the corresponding distribution set of problem (6) and problem (21) by
$(\mathbf{x}_{\mathrm{DR}}, \gamma_{\mathrm{DR}}, \mathcal{F}_{\mathrm{DR}})$ and
$(\mathbf{x}_{\mathrm{VaR}}, \gamma_{\mathrm{VaR}}, \mathcal{F}_{\mathrm{VaR}})$ respectively, then
$(\mathbf{x}_{\mathrm{VaR}}, \mathcal{F}_{\mathrm{VaR}})$ is the optimal solution to the following problem:

$$\begin{array}{ll} \min_{\mathbf{x} \in \mathcal{X}, \mathcal{F}} & \sup_{F \in \mathcal{F}} \ (\mathbb{E}_{\boldsymbol{\xi} \sim F}[H(\mathbf{x}, \boldsymbol{\xi})]) \\ \text{s.t.} & \mathbb{P}_1(F \notin \mathcal{F}) \le 1 - \delta, \end{array} \tag{22}$$

while $(\mathbf{x}_{\mathrm{DR}}, \mathcal{F}_{\mathrm{DR}})$ is only a feasible solution. Therefore, we
have $\gamma_{\mathrm{VaR}} \le \gamma_{\mathrm{DR}}$.


We make the following assumptions in this subsection.

Assumption 2.

1. The loss function $H(\mathbf{x}, \boldsymbol{\xi})$ is (piecewise) continuously differentiable and
convex in $\mathbf{x}$ on $\mathcal{X}$.

2. Both $\mathbb{E}_{\boldsymbol{\xi} \sim F}[H(\mathbf{x}, \boldsymbol{\xi})]$ and
$\mathbb{E}_{\boldsymbol{\xi} \sim F}[\nabla_{\mathbf{x}} H(\mathbf{x}, \boldsymbol{\xi})]$ can be
evaluated efficiently.

The first item in Assumption 2 is necessary. Otherwise there is little hope to solve problem (21)
efficiently even when the risk measures over the random loss are dropped. The second item
ensures that standard optimization techniques can be employed to solve problem (21). When the
second item is violated, as in the case where
$\mathbb{E}_{\boldsymbol{\xi} \sim F}[H(\mathbf{x}, \boldsymbol{\xi})]$ does not have a closed-form
expression and the dimension of $\mathbf{x}$ is high, one must turn to sample-based methods to evaluate
$\mathbb{E}_{\boldsymbol{\xi} \sim F}[H(\mathbf{x}, \boldsymbol{\xi})]$. Consequently, the size of the
problem will be large. For those cases, we will propose an approximation method in the next subsection.

We rewrite problem (21) as a chance-constrained problem:

$$\min_{x\in\mathcal{X},\,t} \ t \quad \text{s.t.} \quad \mathbb{P}_1\big(\mathbb{E}_{\xi\sim F}[H(x,\xi)] \ge t\big) \le 1-\delta. \tag{23}$$

Thus, the optimal solution $x^*$ of problem (21) represents an optimal decision where the $\delta$-quantile of the distribution of $\mathbb{E}_{\xi\sim F}[H(x,\xi)]$ is minimized (note that the random variable here is the distribution parameter $\theta$). A general approach to tackle problem (23) is the sample approximation approach (SAA) (see, e.g., Nemirovski and Shapiro 2006, Luedtke and Ahmed 2008). Using the Monte Carlo method to generate $N$ i.i.d. samples $\theta_i$, $i = 1,\dots,N$, of the distribution parameter from distribution $\mathbb{P}_1$, problem (23) can be approximated by the following mixed integer nonlinear program (MINLP):


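$$\begin{aligned} \min_{x\in\mathcal{X},\,t,\,z} \quad & t \\ \text{s.t.} \quad & -M z_i + \mathbb{E}_{\xi\sim F_i}[H(x,\xi)] \le t, \quad i = 1,2,\dots,N, \\ & \sum_{i=1}^{N} z_i = \lfloor (1-\delta) N \rfloor, \quad z_i \in \{0,1\}, \end{aligned} \tag{24}$$

where $F_i$ is the distribution parameterized by $\theta_i$, and $M$ is a large constant which should be larger than $\max_{x\in\mathcal{X},\, i=1,\dots,N} \mathbb{E}_{\xi\sim F_i}[H(x,\xi)] - \min_{x\in\mathcal{X},\, i=1,\dots,N} \mathbb{E}_{\xi\sim F_i}[H(x,\xi)]$. For example, when $\mathcal{X}$ is bounded, there exists a finite-valued $M$.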
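For a fixed decision $x$, the sampled problem (24) can be solved by inspection, which helps build intuition for the formulation. The following Python sketch assumes that exactly $\lfloor(1-\delta)N\rfloor$ of the binary variables $z_i$ equal one (so that many constraints are relaxed); relaxing the scenarios with the largest expected losses is then optimal, and the optimal $t$ is an order statistic. The function names and the input list of per-scenario expected losses are hypothetical and only serve the illustration.

```python
import math
from itertools import combinations


def saa_quantile(expected_losses, delta):
    """Optimal t of the sampled problem for a fixed decision x.

    expected_losses[i] stands for E_{xi ~ F_i}[H(x, xi)] (hypothetical input).
    Assumes exactly k = floor((1 - delta) * N) constraints may be relaxed.
    """
    n_samples = len(expected_losses)
    # Small tolerance guards against floating-point error in (1 - delta) * N.
    k = int(math.floor((1.0 - delta) * n_samples + 1e-9))
    ordered = sorted(expected_losses, reverse=True)
    # Relax the k largest scenarios; t must cover the largest remaining one.
    return ordered[k]


def saa_quantile_bruteforce(expected_losses, delta):
    """Same value, obtained by enumerating every choice of relaxed scenarios."""
    n_samples = len(expected_losses)
    k = int(math.floor((1.0 - delta) * n_samples + 1e-9))
    indices = range(n_samples)
    best = math.inf
    for relaxed in combinations(indices, k):
        kept = [expected_losses[i] for i in indices if i not in relaxed]
        best = min(best, max(kept))
    return best


# Eight hypothetical scenario values; with delta = 0.75 two constraints are relaxed.
losses = [0.3, 1.2, -0.5, 2.0, 0.9, 1.7, 0.1, 0.4]
t_fast = saa_quantile(losses, delta=0.75)
t_slow = saa_quantile_bruteforce(losses, delta=0.75)
```

The difficulty of (24) therefore comes from optimizing over $x$ and $z$ jointly, not from evaluating the sampled quantile for a given $x$.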

The rationale of this approximation approach is quite clear. By generating $N$ instances of $\theta$, a uniform distribution over $\theta_i$, $i = 1,\dots,N$, instead of $\mathbb{P}_1$ is considered when minimizing the $\delta$-quantile of the expectation. When $N$ is sufficiently large, problem (24) can approximate problem (23) well by the law of large numbers. Specifically, we have the following accuracy and feasibility bound which provides a guideline for choosing a suitable $N$ given desired accuracy levels. The next theorem largely follows the results in Luedtke and Ahmed (2008).


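Theorem 2. Suppose that $\mathcal{X}$ is bounded with diameter $D$ and $\mathbb{E}_{\xi\sim F}[H(x,\xi)]$ is globally Lipschitz continuous with Lipschitz constant $L$ for all $F \in \mathcal{F}$. Assume that both problems (23) and (24) are feasible and their optimal values are finite. Denote (23) and (24) by $P_\delta$ and $P_{N,\delta}$, and their optimal solutions (only the $x$ part) and optimal values by $(x^*_\delta, t^*_\delta)$ and $(x^*_{N,\delta}, t^*_{N,\delta})$, respectively. Let $0 < \tau \le 1-\delta$, $\epsilon \in (0,1)$, $\gamma > 0$, and

$$N_0 = \frac{2}{\tau^2}\left[\log\frac{1}{\epsilon} + n\log\left\lceil\frac{2LD}{\gamma}\right\rceil + \log\left\lceil\frac{2}{\tau}\right\rceil\right]. \tag{25}$$

Then, when $N \ge N_0$, with probability at least $1-2\epsilon$, we have:

$$t^*_{\delta-\tau} - \gamma \ \le\ t^*_{N,\delta} \ \le\ t^*_{\delta+\tau}. \tag{26}$$

Proof. When $N \ge N_0$, the upper bound in (26) is given by Theorem 3 in Luedtke and Ahmed (2008). Also, it follows from Theorem 10 in Luedtke and Ahmed (2008) that a feasible $(x,t)$ for problem:

$$\begin{aligned} \min_{t,\,x,\,z_i\in\{0,1\}} \quad & t \\ \text{s.t.} \quad & -M z_i + \mathbb{E}_{\xi\sim F_i}[H(x,\xi)] + \gamma \le t, \quad i = 1,2,\dots,N, \\ & \sum_{i=1}^{N} z_i = \lfloor (1-\delta) N \rfloor \end{aligned} \tag{27}$$

is feasible for problem $P_{\delta-\tau}$ with at least $1-\epsilon$ probability. Notice that the optimal $t$ for problem (27) equals $t^*_{N,\delta} + \gamma$. Thus, when the feasible solution $(x,t)$ for problem (27) is feasible for problem $P_{\delta-\tau}$, we have $t^*_{N,\delta} + \gamma \ge t^*_{\delta-\tau}$. This completes the proof. □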

Note that in the above conditions, the logarithms are taken over $\epsilon$ and $\gamma$, therefore we can choose very small values of them without increasing the number of necessary samples very much. Meanwhile, experiments show that the bound in Theorem 2 is very conservative (Luedtke and Ahmed 2008) and one can choose a much smaller $N$ in practice.
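The mild, logarithmic dependence on $\epsilon$ and $\gamma$ can be checked numerically. The Python sketch below assumes a sample-size bound of the Luedtke and Ahmed (2008) type, $N_0 = \frac{2}{\tau^2}\big[\log\frac{1}{\epsilon} + n\log\lceil 2LD/\gamma\rceil + \log\lceil 2/\tau\rceil\big]$; this form, the function name `sample_size_bound` and all numerical inputs are assumptions made for illustration only.

```python
import math


def sample_size_bound(tau, eps, gamma, n, lipschitz, diameter):
    """Assumed bound of Luedtke-Ahmed type:

    N0 = (2 / tau^2) * [log(1/eps) + n*log(ceil(2*L*D/gamma)) + log(ceil(2/tau))]
    """
    bracket = (
        math.log(1.0 / eps)
        + n * math.log(math.ceil(2.0 * lipschitz * diameter / gamma))
        + math.log(math.ceil(2.0 / tau))
    )
    return math.ceil(2.0 / tau**2 * bracket)


# Hypothetical problem data: n = 4 decision variables, L = D = 1.
common = dict(tau=0.05, n=4, lipschitz=1.0, diameter=1.0)
base = sample_size_bound(eps=1e-2, gamma=1e-3, **common)
tight_eps = sample_size_bound(eps=1e-6, gamma=1e-3, **common)
tight_gamma = sample_size_bound(eps=1e-2, gamma=1e-6, **common)
```

Under these assumptions, tightening $\epsilon$ or $\gamma$ by several orders of magnitude increases the bound by less than a factor of two, whereas the bound scales with $1/\tau^2$.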


Numerically, software packages such as CPLEX (mixed integer linear problems) and MOSEK (certain MINLP problems) can be used to solve moderate sized problems. When Assumption 2 holds, problem (24) may be solved by those solvers.


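Remark 1. It is worth pointing out that model (24) can also be used to solve a family of VaR-VaR problems and VaR-CVaR problems. For portfolio optimization problems where $H(x,\xi) = \xi^T x$ and the distribution of $\xi$ is a normal distribution with mean $\mu$ and covariance $\Gamma$, both $\mathrm{VaR}_\delta(H(x,\xi))$ and $\mathrm{CVaR}_\delta(H(x,\xi))$ have closed-form expressions (Sarykalin et al. 2008):

$$\mathrm{VaR}_\delta(\xi^T x) = \Phi^{-1}(\delta)\sqrt{x^T\Gamma x} + \mu^T x, \tag{28}$$

$$\mathrm{CVaR}_\delta(\xi^T x) = \frac{1}{\sqrt{2\pi}\,(1-\delta)}\exp\!\left(-\frac{(\Phi^{-1}(\delta))^2}{2}\right)\sqrt{x^T\Gamma x} + \mu^T x, \tag{29}$$

where $\Phi^{-1}(\cdot)$ is the inverse of the cumulative distribution function of a standard normal distribution. In this case, we can minimize $\mathrm{VaR}_\delta$-$\mathrm{VaR}_\epsilon$ and $\mathrm{VaR}_\delta$-$\mathrm{CVaR}_\epsilon$ by solving the following problems:

$$\min_{x\in\mathcal{X}} \ \mathrm{VaR}_\delta\!\left(\Phi^{-1}(\epsilon)\sqrt{x^T\Gamma x} + \mu^T x\right), \tag{30}$$

$$\min_{x\in\mathcal{X}} \ \mathrm{VaR}_\delta\!\left(\frac{1}{\sqrt{2\pi}\,(1-\epsilon)}\exp\!\left(-\frac{(\Phi^{-1}(\epsilon))^2}{2}\right)\sqrt{x^T\Gamma x} + \mu^T x\right), \tag{31}$$

both of which can be solved approximately by a mixed integer second order cone program (MISOCP) using the SAA approach.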
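The closed-form expressions (28) and (29) are straightforward to evaluate for a given portfolio. The Python sketch below uses only the standard library; the function names and the two-asset data are hypothetical, and the covariance matrix is passed as nested lists purely for simplicity.

```python
import math
from statistics import NormalDist


def portfolio_std(x, cov):
    """sqrt(x^T Gamma x) for a covariance matrix given as nested lists."""
    n = len(x)
    quad = sum(x[i] * cov[i][j] * x[j] for i in range(n) for j in range(n))
    return math.sqrt(quad)


def portfolio_mean(x, mean):
    """mu^T x."""
    return sum(m * w for m, w in zip(mean, x))


def normal_var(x, mean, cov, delta):
    """Closed-form VaR_delta of xi^T x for normal xi, as in (28)."""
    z = NormalDist().inv_cdf(delta)
    return z * portfolio_std(x, cov) + portfolio_mean(x, mean)


def normal_cvar(x, mean, cov, delta):
    """Closed-form CVaR_delta of xi^T x for normal xi, as in (29)."""
    z = NormalDist().inv_cdf(delta)
    density = math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)
    return density / (1.0 - delta) * portfolio_std(x, cov) + portfolio_mean(x, mean)


# Hypothetical two-asset example.
x = [0.5, 0.5]
mean = [0.01, 0.02]
cov = [[0.04, 0.01], [0.01, 0.09]]
var_95 = normal_var(x, mean, cov, delta=0.95)
cvar_95 = normal_cvar(x, mean, cov, delta=0.95)
```

Both quantities are the sum of a linear term and a positive multiple of $\sqrt{x^T\Gamma x}$ when $\delta > 0.5$, which is what makes second order cone formulations possible.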

Remark 2. It is also worth mentioning that when the objective function $H(x,\xi)$ is linear in $\xi$ (such as in the portfolio selection problem), the VaR-Expectation model is similar to a VaR model in formulation. More precisely, suppose $H(x,\xi) = \xi^T f(x)$, then $\mathrm{VaR}_\delta(\mathbb{E}_\xi[H(x,\xi)]) = \mathrm{VaR}_\delta((\mathbb{E}\xi)^T f(x))$. Such a formulation has been studied in Cui et al. (2013) and Wen et al. (2013) in the context of the portfolio selection problem. In particular, Wen et al. (2013) show that an alternating direction augmented Lagrangian method (ADM) can be applied to solve the problem very efficiently in that case. However, the VaR-Expectation model is different from the VaR model if the objective is not linear in $\xi$, since the expectation can no longer be moved inside $H(x,\cdot)$. Moreover, even in the linear case, the VaR-Expectation model has a very different interpretation from the VaR model: in the VaR model, the uncertainty is directly in $\xi$ and one has to assume a certain distribution of $\xi$ (which is usually estimated from an empirical distribution); while in the VaR-Expectation model, the uncertainty is in the distribution of $\xi$. Such different interpretations would usually lead to different uncertainty sets even with the same set of observations.


4.2. Minimizing CVaR-CVaR and CVaR-Expectation 

Using the same idea as above, we formulate a robust model for CVaR optimization by choosing $\mu(\cdot)$ as $\delta$-VaR and $g_F(\cdot)$ as $\epsilon$-CVaR, namely,

$$\min_{x\in\mathcal{X},\,t} \ t \quad \text{s.t.} \quad \mathbb{P}_1\big(\mathrm{CVaR}_\epsilon(H(x,\xi_F)) \ge t\big) \le 1-\delta,$$

where $\mathbb{P}_1(\cdot)$ is the probability measure of $F$, and $\xi_F$ means that the distribution of $\xi$ is $F$. This model can be viewed as minimizing the upper bound for the one-sided $\delta$-confidence interval for the CVaR.

For most cases, it is impossible to derive a closed-form expression for CVaR. In practice, sample based methods like sample average approximation (SAA) are widely used to compute CVaR. To ensure the accuracy of evaluation, the number of samples is typically large (the exact number of necessary samples will be discussed later). Therefore, if we directly replace expectation by CVaR in model (24), the resulting problem will be a large-sized mixed integer program, which is difficult to solve. The same situation occurs in a VaR-Expectation model where the expectation cannot be evaluated directly and must be approximated by a sample average.

To deal with the issue mentioned above, we relax the outer VaR to CVaR, leading to a CVaR-CVaR (nested CVaR) model:

$$\min_{x\in\mathbb{R}^n,\,\alpha\in\mathbb{R}} \ \alpha + \frac{1}{1-\delta}\,\mathbb{E}\big[(\mathrm{CVaR}_\epsilon(H(x,\xi_F)) - \alpha)^+\big]. \tag{32}$$

Similarly, we can consider the following CVaR-Expectation model instead of the VaR-Expectation model:

$$\min_{x\in\mathcal{X}} \ \mathrm{CVaR}_\delta\big(\mathbb{E}_{\xi\sim F}[H(x,\xi)]\big). \tag{33}$$

The reason for making the above relaxations is twofold. First, Delbaen (2002) shows that $\mathrm{CVaR}_\delta(Z)$ is the smallest upper bound of $\mathrm{VaR}_\delta(Z)$ among all coherent risk measures that depend only on the distribution of $Z$. By relaxing the original problem to convex programs, a wide range of convex optimization algorithms can be employed to solve them. Second, optimization problems similar to model (32) have been investigated in the literature in the context of risk averse multistage stochastic programming (see, e.g., Guigues and Römisch 2012, and Philpott and de Matos 2012), thus similar algorithms and techniques can be employed. In addition, since models (32) and (33) take into account the extent of losses in the most adverse scenarios, they are also meaningful in their own right. In the following, we present a general regime to solve problem (32) using the SAA approach. Since the discussion of model (33) resembles that of model (32) and is simpler, we will only discuss model (32).


SAA is a popular method in stochastic programming and has been used to deal with CVaR optimization problems, see Shapiro (2006) and Wang and Ahmed (2008). The concept of SAA is to approximate an expectation by the average of many samples generated by the Monte Carlo method or other schemes. In problem (32), we have

$$\mathbb{E}\big[(\mathrm{CVaR}_\epsilon(H(x,\xi_F)) - \alpha)^+\big] \approx \frac{1}{N}\sum_{i=1}^{N}\big(\mathrm{CVaR}_\epsilon(H(x,\xi_{F_i})) - \alpha\big)^+,$$

where $N$ i.i.d. samples of $\theta$, $\theta_1,\dots,\theta_N$, are generated from the distribution of $\theta$ and the parameter of the distribution of $\xi_{F_i}$ is $\theta_i$. Then, the original optimization problem can be approximated by the following problem:


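$$\begin{aligned} \min_{\alpha\in\mathbb{R},\,x\in\mathbb{R}^n,\,u\in\mathbb{R}^N} \quad & \alpha + \frac{1}{(1-\delta)N}\,e^T u \\ \text{s.t.} \quad & u_i \ge \mathrm{CVaR}_\epsilon(H(x,\xi_{F_i})) - \alpha, \quad i = 1,2,\dots,N, \\ & u_i \ge 0, \quad i = 1,2,\dots,N, \end{aligned} \tag{34}$$

where $e$ denotes a vector of all ones. Notice that problem (34) is a CVaR constrained problem, and the CVaR in the constraint can also be approximated using SAA, resulting in the following problem: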


$$\begin{aligned} \min_{\alpha,\,x,\,u,\,v} \quad & \alpha + \frac{1}{(1-\delta)N}\,e^T u \\ \text{s.t.} \quad & u_i \ge v_i + \frac{1}{(1-\epsilon)M}\sum_{j=1}^{M}\big(H(x,\xi_{i,j}) - v_i\big)^+ - \alpha, \quad i = 1,2,\dots,N, \\ & u_i \ge 0, \quad i = 1,2,\dots,N, \end{aligned} \tag{35}$$

where $v = (v_1,\dots,v_N)$ are auxiliary variables and $(\xi_{i,1},\dots,\xi_{i,M})$, $i = 1,\dots,N$, are i.i.d. samples generated from $F_i$. When $H(x,\xi)$ is convex in $x$, problem (35) is a convex program with a linear objective function and can be solved efficiently for large sized problems. Now we turn to the number of samples, $N$ and $M$. For $M$, Wang and Ahmed (2008) show that under certain regularity conditions, there is at least $1-\epsilon$ probability that the SAA of CVaR lies within the $\gamma$-neighborhood of CVaR when




U{H,F) 


T 


C 2 iH,F)n + C,{H,F)log{ - 


(36) 


where $C_1(H,F)$, $C_2(H,F)$ and $C_3(H,F)$ are constants for a given objective function $H(x,\xi)$ and distribution $F$, and $n$ is the dimension of $x$. Since the constants are typically difficult to calculate, the bound in (36) serves as a benchmark to estimate the order of $M$. For $N$, Shapiro (2006) proves that under mild conditions, the sample size




rf 


nlog 


T>2(g,Pl) 

7 


+ log 




(37)

ensures that the solution of problem (34) lies within the $\gamma$-neighborhood of the solution of problem (32) with probability $1-\epsilon$. Here the quantities $D_i(H,\mathbb{P}_1)$ are constants for a given objective function and distribution $\mathbb{P}_1$ (of $F$), and $n$ is the dimension of $x$.

Using the estimates in (36) and (37), we can see that the size of problem (35) is quite large. In practice, there are many ways to accelerate the computation. For instance, Alexander et al. (2006) and Xu and Zhang (2009) show that smoothing the CVaR can enhance the computation efficiency up to several times.
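For a fixed decision $x$, the nested sample-average objective behind problem (35) can be evaluated directly, which is useful for checking a candidate solution. The Python sketch below uses the Rockafellar and Uryasev (2000) representation $\mathrm{CVaR}_\epsilon(Z) = \min_v \{v + \mathbb{E}[(Z-v)^+]/(1-\epsilon)\}$ on samples; the function names and the toy data-generating model (normal losses with an uncertain mean) are hypothetical and not taken from the experiments in this paper.

```python
import random


def sample_cvar(losses, level):
    """Sample CVaR via min_v v + mean((loss - v)^+) / (1 - level).

    The objective is piecewise linear and convex in v, so a minimizer can be
    found among the sample points themselves.
    """
    m = len(losses)

    def objective(v):
        excess = sum(max(loss - v, 0.0) for loss in losses)
        return v + excess / ((1.0 - level) * m)

    return min(objective(v) for v in losses)


def nested_cvar(loss_samples, outer_level, inner_level):
    """loss_samples[i][j] plays the role of H(x, xi_ij), with xi_ij drawn from F_i."""
    inner = [sample_cvar(row, inner_level) for row in loss_samples]
    return sample_cvar(inner, outer_level)


# Hypothetical toy model: theta_i is an uncertain mean, xi_ij ~ Normal(theta_i, 1),
# and the loss of the fixed decision is simply H = xi.
rng = random.Random(0)
thetas = [rng.gauss(0.0, 0.5) for _ in range(50)]            # N = 50 outer samples
samples = [[rng.gauss(theta, 1.0) for _ in range(200)]        # M = 200 inner samples
           for theta in thetas]
value = nested_cvar(samples, outer_level=0.9, inner_level=0.9)
```

Each evaluation touches all $N \times M$ samples, which illustrates why the size of problem (35) grows quickly with the two sample sizes.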


Remark 3. When the expression inside CVaR can be evaluated and differentiated easily, we only need to sample from distribution $\mathbb{P}_1$ and the resulting problem is much easier to solve. For instance, consider the following models:

$$\min_{x\in\mathcal{X}} \ \mathrm{CVaR}_\delta\big(\mathrm{VaR}_\epsilon(\xi^T x)\big), \tag{38}$$

$$\min_{x\in\mathcal{X}} \ \mathrm{CVaR}_\delta\big(\mathrm{CVaR}_\epsilon(\xi^T x)\big), \tag{39}$$

where the distribution of $\xi$ is normal. Since $\mathrm{VaR}_\epsilon(\xi^T x)$ and $\mathrm{CVaR}_\epsilon(\xi^T x)$ have closed-form expressions (given in (28), (29)), problems (38) and (39) can be solved by second order cone programs (SOCP). For CVaR-Expectation problems, when the distribution of $\xi$ is discrete, problem (33) reduces to a mean-CVaR problem and can be solved efficiently (Iyengar and Ma 2013 and Wen et al. 2013).

Having filled in several empty entries in Table 1, we have Table 2. In Table 2, the symbol ~ means that the model is equivalent to another model labeled by the number and the symbol × means that such a combination of risk measures has not been considered yet. (In Table 2, the VaR-Worst-Case and CVaR-Worst-Case models can be viewed as VaR-CVaR and CVaR-CVaR models, respectively, with the risk level of the inner CVaR being 1.)

5. Numerical Experiments 

In this section, we conduct numerical experiments to demonstrate the tractability and effectiveness of the models proposed in Section 4. In particular, we consider the VaR-Expectation model, the

Table 2 Different composites of risk measures: a full table


9f{-) 

m(-) 

Expectation 

CVaR 

VaR 

Worst-Case 

Singleton 

© 

® 

m 

© 

Expectation 


X 

X 

X 

VaR 

m 

EJ 

(|30l) 

-El 

Worst-Case 

® 

Ha 

El) 


CVaR 

(1331 

(133 

(I3E1) 

~(I33 


CVaR-Expectation model and the CVaR-CVaR model and conduct numerical experiments using the portfolio selection problem. In the portfolio selection problem, a decision maker has to choose a portfolio to invest using available stocks based on historical data. In particular, our data set contains the daily returns of 359 different stocks that are in the S&P 500 index and do not have missing data from 2010 to 2011. In the following, we first show that when using a VaR-Expectation model, the corresponding distribution sets (for the VaR) depend on the decision, which is a distinguishing feature of the model. We show that this feature makes the VaR-Expectation model less conservative than the distributionally robust model in Delage and Ye (2010). Then we demonstrate that all three models can be computed efficiently even when the sample size is large enough to ensure the precision of the solution. Finally, we compare the resulting returns of the three models with existing models.


5.1. Comparing the VaR-Expectation model and the distributionally robust model 

We have discussed in Section 4.1 that one important feature of our proposed approach, the VaR-Expectation model, is that the distribution set (for the VaR) depends on the decision $x$, and therefore the solution obtained from this model will be less conservative than that of a traditional distributionally robust model in which the distribution set is independent of $x$. Here we illustrate this feature using numerical examples. Suppose a decision maker builds a portfolio of $n$ stocks using the VaR-Expectation model. The joint distribution of the stocks is parameterized by an $n$-dimensional normal distribution with mean $\mu$ and covariance matrix $\Sigma$. At each decision time, we use the returns in the past $t$ days to derive the posterior distribution for $\mu$ and $\Sigma$. Let $\mu_0$ and $\Sigma_0$ denote the mean and covariance of the empirical distribution of the stocks in the past $t$ days. Using the multivariate Jeffreys prior density as the prior distribution (which is the uninformative prior for normal distributions), we have that the posterior distribution $\mathbb{P}_1$ of $\mu$ and $\Sigma$ is given by (Gelman et al. 2013):

$$f(\mu,\Sigma) = \mathcal{N}(\mu \mid \mu_0, t^{-1}\Sigma)\,\mathcal{W}^{-1}(\Sigma \mid t\Sigma_0, t-1), \tag{40}$$

where $\mathcal{N}(\mu \mid \mu_0, t^{-1}\Sigma)$ is the probability density function of a multivariate normal distribution with mean $\mu_0$ and covariance matrix $t^{-1}\Sigma$, and $\mathcal{W}^{-1}(\Sigma \mid t\Sigma_0, t-1)$ is the probability density function of an inverse-Wishart distribution with scale matrix $t\Sigma_0$ and degree of freedom $t-1$. In the following, we set $t = 30$. For each fixed decision (allocation) $x$, the worst-case distribution $F_0(x) = \mathcal{N}(\mu_0(x), \Sigma_0(x))$ of loss $\xi$ satisfies the following condition:

$$\mathbb{P}_1\big(\mathbb{E}_F[\xi^T x] \le \mathbb{E}_{F_0(x)}[\xi^T x]\big) = 1-\delta,$$

or equivalently,

$$\mathbb{P}_1\big(\mu^T x \le \mu_0(x)^T x\big) = 1-\delta.$$

Thus, the distribution set of the VaR-Expectation model is:

$$\mathcal{F}_x = \big\{\mathcal{N}(\mu,\Sigma) \ \big|\ \mu^T x \ge \mu_0(x)^T x\big\},$$

which depends on $x$. The optimization problem under the VaR-Expectation model is (we assume that the total investment amount must equal 1):

$$\max_{x\ge 0,\ e^T x = 1} \ \mu_0(x)^T x. \tag{41}$$
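The dependence of the distribution set on $x$ can be seen with a small numerical experiment. The Python sketch below assumes that posterior draws of $\mu$ are already available; for simplicity they are generated here from a hypothetical independent normal posterior, not from the normal-inverse-Wishart posterior (40). The threshold $\mu_0(x)^T x$ is approximated by the empirical $(1-\delta)$-quantile of $\mu^T x$, and all function names are illustrative.

```python
import math
import random


def portfolio_mean(mu, x):
    """mu^T x for one posterior draw mu and allocation x."""
    return sum(m * w for m, w in zip(mu, x))


def worst_case_mean_return(mu_samples, x, delta):
    """Empirical (1 - delta)-quantile of mu^T x over posterior draws of mu.

    About a fraction delta of the draws then satisfy mu^T x >= threshold,
    which is the half-space defining the x-dependent distribution set.
    """
    returns = sorted(portfolio_mean(mu, x) for mu in mu_samples)
    # Small tolerance guards against floating-point error in (1 - delta) * N.
    k = int(math.floor((1.0 - delta) * len(returns) + 1e-9))
    return returns[k]


rng = random.Random(1)
# Hypothetical posterior draws for two assets (not the posterior used in the paper).
mu_samples = [(rng.gauss(0.001, 0.002), rng.gauss(0.002, 0.006)) for _ in range(2000)]

results = []
for x in ([0.25, 0.75], [0.75, 0.25]):
    threshold = worst_case_mean_return(mu_samples, x, delta=0.95)
    inside = sum(1 for mu in mu_samples if portfolio_mean(mu, x) >= threshold)
    results.append((x, threshold, inside / len(mu_samples)))
```

The two allocations yield different thresholds and hence different half-spaces in the space of $\mu$, while each half-space still contains roughly a fraction $\delta$ of the posterior draws.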


To compare with the distributionally robust model in Delage and Ye (2010), we note that the objective function in Delage and Ye (2010) is

$$\max_{x\ge 0,\ e^T x = 1} \ \min_{F\in\mathcal{F}} \ \mathbb{E}_F[\xi^T x], \tag{42}$$

where

$$\mathcal{F} = \left\{ F(\xi)\in\mathcal{M} \ \middle|\ \begin{array}{l} (\mathbb{E}[\xi]-\mu_0)^T\Sigma_0^{-1}(\mathbb{E}[\xi]-\mu_0) \le \gamma_1, \\ \mathbb{E}[(\xi-\mu_0)(\xi-\mu_0)^T] \preceq \gamma_2\Sigma_0 \end{array} \right\},$$

where $\gamma_1$ and $\gamma_2$ are chosen according to the discussions in Delage and Ye (2010) to ensure that with probability $\delta$, $\mathcal{F}$ contains the true distribution. In the following, we fix $\delta = 0.95$. We perform the following procedure: we randomly pick two stocks and use their empirical returns from 3/4/2010 to 4/14/2010 to fit a normal distribution. Then we draw $10^5$ data points from the distribution to form the data set. (In Delage and Ye (2010), it usually needs at least $10^5$ data points to get valid $\gamma_1$ and $\gamma_2$, therefore we have to take this bootstrapping method.) Then we solve (41) and (42) respectively using these data. By the discussions in Section 4.1, the optimal value of (41) should be larger than that of (42). In our numerical experiment, we repeat the above procedure 1000 times and plot the result in Figure 1. The left figure in Figure 1 shows a scatter plot of the optimal values obtained by the distributionally robust model in Delage and Ye (2010) (x-axis) and the VaR-Expectation model (y-axis) in the 1000 experiments, and the right figure shows the distribution of the difference of the optimal values in the same set of experiments. From Figure 1 we can see that the VaR-Expectation model results in a higher value in all the cases, meaning that it is less conservative. Indeed, the average optimal value of (41) in the 1000 experiments is 0.070% larger than that of (42). Moreover, for one particular experiment, we plot the projection of $\mathcal{F}_{[0.75,0.25]}$, $\mathcal{F}_{[0.25,0.75]}$ and the distribution set of the DRO model on the plane spanned by $\mu$ in Figure 2. From Figure 2 we can see that in the DRO approach, the distribution set is independent of the choice of $x$. However, in the VaR-Expectation model, the distribution set changes with the choice of $x$. As we have mentioned above, it is this feature of the VaR-Expectation model that makes the solution less conservative yet with the same level of probabilistic guarantee.

Figure 1 Comparison between the VaR-Expectation model and the DRO model.

Figure 2 Comparison of distribution sets. Panels: the DRO case (shaded area); the $\mathcal{F}_{[0.25,0.75]}$ case (shaded area); the $\mathcal{F}_{[0.75,0.25]}$ case (shaded area).


5.2. Solving the composite risk measure model 


Under the setting in Section 5.1, the VaR-Expectation model can be solved by the alternating direction augmented Lagrangian method (ADM, see Wen et al. 2013), the CVaR-Expectation model can be solved by an LP, and the CVaR-CVaR model can be solved by an SOCP. We evaluate the performance of these three models with different stock number $n$ and sample size $N$. Our experiments are performed on a laptop with 8.00 GB of RAM and 2.20 GHz processor, using MOSEK with the Matlab interface.

We first solve the VaR-Expectation, CVaR-Expectation and CVaR-CVaR models for different sample size $N$ when $t = 30$ and $n = 4$, while both outer and inner risk levels are chosen as 0.95. The stocks are randomly chosen from all 359 stocks and the period we consider is from 3/4/2010 to 4/14/2010. We perform the experiment on the same set of stocks 100 times, and the results are shown in Table 3 and Figure 3. In Table 3, the first column, denoted by “ave”, for each method is the average of the optimal values of the corresponding models, the second column, denoted by “std”, is the standard deviation of the optimal values in all experiments, while the third column is the average computation time (in seconds). Notice that when $N = 100000$, all three models can still be computed efficiently. In the meantime, the solutions of the SAA problems can approximate the solution of the original models very well. Figure 3 displays the scatter plots of the weights of the first 3 stocks in the optimal portfolio of all experiments (the weight of the last stock can be computed by one minus the total weights of the first three stocks). The result shows that the larger $N$ is, the more concentrated the solution is. It indicates that the optimal solution also converges as the sample size becomes large.

Table 3 Computation time and precision of different CRM models when n = 4.

| N (#samples) | VaR-Expectation ave | std | time (s) | CVaR-Expectation ave | std | time (s) | CVaR-CVaR ave | std | time (s) |
|---|---|---|---|---|---|---|---|---|---|
| 5000 | 0.0005 | 1.91 × 10^-4 | 5.12 | -0.0007 | 1.44 × 10^-4 | 0.13 | -0.0370 | 2.80 × 10^-4 | 0.76 |
| 10000 | 0.0004 | 1.56 × 10^-4 | 5.17 | -0.0007 | 7.94 × 10^-5 | 0.16 | -0.0371 | 2.01 × 10^-4 | 1.78 |
| 20000 | 0.0004 | 1.09 × 10^-4 | 4.99 | -0.0007 | 6.68 × 10^-5 | 0.25 | -0.0371 | 1.44 × 10^-4 | 3.26 |
| 50000 | 0.0005 | 6.43 × 10^-5 | 7.58 | -0.0007 | 4.20 × 10^-5 | 0.70 | -0.0372 | 7.94 × 10^-5 | 10.02 |
| 100000 | 0.0004 | 4.63 × 10^-5 | 9.26 | -0.0007 | 3.18 × 10^-5 | 1.65 | -0.0373 | 7.17 × 10^-5 | 17.64 |

Now we fix $N = 5000$ and choose different $n$, while the period considered and all other parameters are the same as in the previous experiment. The results are presented in Table 4, where $t_{ave}$ is the average computation time in all experiments, $t_{min}$ is the minimum computation time and $t_{max}$ is the maximum computation time. The models considered are (from left to right in Table 4): VaR-Expectation model, CVaR-Expectation model and CVaR-CVaR model. We can see that our new models can still be computed in a reasonable amount of time even when $n$ is large.

Figure 3 Optimal allocations of the approximated problems. Panels: VaR-Exp., CVaR-Exp. and CVaR-CVaR, each for N = 5000, N = 20000 and N = 100000.

Table 4 Computation time of different CRM models when N = 5000.

| n (#stocks) | VaR-Expectation t_ave | t_min | t_max | CVaR-Expectation t_ave | t_min | t_max | CVaR-CVaR t_ave | t_min | t_max |
|---|---|---|---|---|---|---|---|---|---|
| 10 | 4.52 | 3.93 | 4.97 | 0.14 | 0.13 | 0.20 | 3.17 | 2.71 | 5.72 |
| 20 | 4.49 | 3.98 | 5.07 | 0.18 | 0.17 | 0.22 | 6.77 | 5.22 | 12.48 |
| 30 | 5.36 | 4.57 | 6.16 | 0.22 | 0.20 | 0.25 | 12.62 | 10.39 | 19.43 |
| 40 | 6.19 | 5.44 | 7.07 | 0.28 | 0.25 | 0.32 | 19.49 | 15.40 | 29.15 |
| 50 | 6.38 | 5.60 | 7.11 | 0.33 | 0.29 | 0.39 | 29.33 | 22.33 | 41.29 |

5.3. Comparing CRM models with existing models using real market data

To test the performance of our new models in real world trading operations, we compare our models with existing models for the portfolio selection problem. In each experiment, we randomly choose 4 stocks from all 359 stocks to build a dynamic portfolio during the period from 3/4/2010 to 4/27/2011 (300 days in total). The portfolio is recomputed every day using the returns of the last 30 days as input data. At each day of the experiment, only the returns of the last 30 days can be used. We set the sample size as 2000 for the VaR-Expectation, CVaR-Expectation and CVaR-CVaR models, and compare the results with the distributionally robust model (Delage and Ye 2010), the worst-case VaR model (El Ghaoui et al. 2003) and the single stock (SS) model. The single stock model chooses the stock that has the highest average return rate in the last 30 days as the sole stock for that day, and is used as a naive benchmark here. The average cumulative return of each day is shown in Figure 4, while the average standard deviation and daily return are presented in Table 5. In this experiment, the volatilities of our models are significantly smaller than that of the SS approach. The CVaR-Expectation model and VaR-Expectation model achieve better return rates than the other models, while the CVaR-CVaR model performs as well as the DRO model and the worst-case VaR model. Though we cannot draw general conclusions about which model is intrinsically better without more intensive tests, this experiment shows that the CRM models choose portfolios with robust performance, which is a property we desire.

Figure 4 Comparison of wealth accumulation of all models.

Table 5 Average return rate and standard deviation of all models.

| | VaR-Exp. | CVaR-Exp. | CVaR-CVaR | DRO | W-C VaR | SS |
|---|---|---|---|---|---|---|
| ave return | 0.096 % | 0.096 % | 0.087 % | 0.088 % | 0.087 % | 0.078 % |
| ave std | 1.49 × 10^-2 | 1.34 × 10^-2 | 1.17 × 10^-2 | 1.19 × 10^-2 | 1.17 × 10^-2 | 2.17 × 10^-2 |
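The single stock (SS) benchmark is simple enough to state in a few lines. The Python sketch below follows the rule described in this subsection (pick the stock with the highest average return over the last 30 days, recomputed every day); the data layout `returns[d][s]` and the function names are hypothetical.

```python
def single_stock_choice(returns, day, window=30):
    """Index of the stock with the highest average return over the `window`
    days preceding `day`. returns[d][s] is the return of stock s on day d."""
    history = returns[day - window:day]
    n_stocks = len(history[0])
    averages = [sum(row[s] for row in history) / len(history) for s in range(n_stocks)]
    return max(range(n_stocks), key=lambda s: averages[s])


def single_stock_backtest(returns, window=30):
    """Realized daily returns of the SS rule, with the choice recomputed daily
    from the preceding `window` days only."""
    realized = []
    for day in range(window, len(returns)):
        pick = single_stock_choice(returns, day, window)
        realized.append(returns[day][pick])
    return realized


# Hypothetical data: 35 days, two stocks with constant returns.
toy_returns = [[0.01, 0.02] for _ in range(35)]
toy_realized = single_stock_backtest(toy_returns, window=30)
```

The same rolling loop structure applies to the other models, with the selection step replaced by the solution of the corresponding optimization problem.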

6. Conclusions 

In this paper, we propose a unified framework for decision making under uncertainty using the composite of risk measures. Our focus has been the case where the distribution of uncertain model parameters can be parameterized by a finite number of parameters, which includes a large family of problems. The generality of our framework allows us to unify several existing models as well as to construct new models within the framework. We show through theoretical proofs and numerical experiments that our new paradigms yield less conservative solutions, yet provide the same degree of probabilistic guarantee.

References 

Alexander, S., T. F. Coleman, Y. Li. 2006. Minimizing CVaR and VaR for a portfolio of derivatives. Journal 
of Banking & Finance 30(2) 583-605. 

Artzner, P., F. Delbaen, J. M. Eber, D. Heath. 1999. Coherent measures of risk. Mathematical Finance 9(3) 
203-228. 

Ben-Tal, A., D. den Hertog, A. De Waegenaere, B. Melenberg, G. Rennen. 2013. Robust solutions of 
optimization problems affected by uncertain probabilities. Management Science 59(2) 341-357. 

Ben-Tal, A., L. El Ghaoui, A. Nemirovski. 2009. Robust Optimization. Princeton University Press. 

Ben-Tal, A., A. Nemirovski. 1999. Robust solutions of uncertain linear programs. Operations Research 
Letters 25(1) 1-13. 

Bertsimas, D., D. B. Brown. 2009. Constructing uncertainty sets for robust linear optimization. Operations 
Research 57(6) 1483-1495. 

Bertsimas, D., D. B. Brown, C. Caramanis. 2011. Theory and applications of robust optimization. SIAM 
Review 53(3) 464-501. 

Bertsimas, D., V. Gupta, N. Kallus. 2014. Data-driven robust optimization. Working paper.

Bertsimas, D., D. Pachamanova, M. Sim. 2004. Robust linear optimization under general norms. Operations 
Research Letters 32(6) 510-516. 

Bertsimas, D., I. Popescu. 2005. Optimal inequalities in probability theory: A convex optimization approach. 
SIAM Journal on Optimization 15(3) 780-804. 

Bertsimas, D., M. Sim. 2004. The price of robustness. Operations Research 52(1) 35-53. 

Birge, J. R., F. Louveaux. 2011. Introduction to Stochastic Programming. Springer. 

Chen, X., M. Sim, P. Sun. 2007. A robust optimization perspective on stochastic programming. Operations
Research 55(6) 1058-1071.



Qian, Wang and Wen: A Composite Risk Measure Framework for Decision Making under Uncertainty
Article submitted to Operations Research; manuscript no. (Please, provide the manuscript number!)




Cui, X., S. Zhu, X. Sun, D. Li. 2013. Nonlinear portfolio selection using approximate parametric value-at-risk. 
Journal of Banking & Finance 37(6) 2124-2139. 

Dantzig, G. B. 1955. Linear programming under uncertainty. Management Science 1(3-4) 197-206. 

Delage, E., Y. Ye. 2010. Distributionally robust optimization under moment uncertainty with application to 
data-driven problems. Operations Research 58(3) 595-612. 

Delbaen, F. 2002. Coherent risk measures on general probability spaces. K. Sandmann, P. J. Schoenbucher, 
eds., Advances in Finance and Stochastics. Springer Berlin Heidelberg, 1-37. 

Dupacova, J. 1987. The minimax approach to stochastic programming and an illustrative application. 
Stochastics: An International Journal of Probability and Stochastic Processes 20(1) 73-88. 

Einhorn, D., A. Brown. 2008. Private profits and socialized risk. Global Association of Risk Professionals 
42 10-26. 

El Ghaoui, L., H. Lebret. 1997. Robust solutions to least-squares problems with uncertain data. SIAM 
Journal on Matrix Analysis and Applications 18(4) 1035-1064. 

El Ghaoui, L., M. Oks, F. Oustry. 2003. Worst-case value-at-risk and robust portfolio optimization: A conic 
programming approach. Operations Research 51(4) 543-556. 

El Ghaoui, L., F. Oustry, H. Lebret. 1998. Robust solutions to uncertain semidefinite programs. SIAM 
Journal on Optimization 9(1) 33-52. 

Follmer, H., A. Schied. 2002. Convex measures of risk and trading constraints. Finance and Stochastics 6(4) 
429-447. 

Gaivoronski, A. A., G. Pflug. 2005. Value-at-risk in portfolio optimization: properties and computational 
approach. Journal of Risk 7(2) 1-31. 

Gelman, A., J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, D. B. Rubin. 2013. Bayesian Data Analysis. 
CRC press. 

Guigues, V., W. Romisch. 2012. Sampling-based decomposition methods for multistage stochastic programs 
based on extended polyhedral risk measures. SIAM Journal on Optimization 22(2) 286-312. 

Huber, P. J. 1981. Robust Statistics. Wiley series in probability and mathematical statistics, Wiley, New York.





Iyengar, G., A. K. C. Ma. 2013. Fast gradient descent method for mean-CVaR optimization. Annals of 
Operations Research 205(1) 203-212. 

Kast, R., E. Luciano, L. Peccati. 1998. VaR and optimization. 2nd International Workshop on Preferences 
and Decisions, Trento, July, vol. 1. 1998. 

Klabjan, D., D. Simchi-Levi, M. Song. 2013. Robust stochastic lot-sizing by means of histograms. Production 
and Operations Management 22(3) 691-710. 

Lucas, A., P. Klaassen. 1998. Extreme returns, downside risk, and optimal asset allocation. The Journal of 
Portfolio Management 25(1) 71-79. 

Luedtke, J., S. Ahmed. 2008. A sample approximation approach for optimization with probabilistic constraints.
SIAM Journal on Optimization 19(2) 674-699.

Natarajan, K., D. Pachamanova, M. Sim. 2009. Constructing risk measures from uncertainty sets. Operations 
Research 57(5) 1129-1141. 

Nemirovski, A., A. Shapiro. 2006. Scenario approximations of chance constraints. Probabilistic and randomized
methods for design under uncertainty. Springer, 3-47.

Pardo, L. 2005. Statistical Inference Based on Divergence Measures. CRC Press. 

Philpott, A. B., V. L. de Matos. 2012. Dynamic sampling algorithms for multi-stage stochastic programs 
with risk aversion. European Journal of Operational Research 218(2) 470-483. 

Prekopa, A. 1995. Stochastic Programming. Springer. 

RiskMetrics. 1996. Riskmetrics: Technical Document. Morgan Guaranty Trust Company of New York. 

Rockafellar, R. T., S. Uryasev. 2000. Optimization of conditional value-at-risk. Journal of Risk 2 21-42. 

Sarykalin, S., G. Serraino, S. Uryasev. 2008. Value-at-risk vs. conditional value-at-risk in risk management
and optimization. Tutorials in Operations Research. INFORMS, Hanover, MD.

Scarf, H., K. J. Arrow, S. Karlin. 1958. A min-max solution of an inventory problem. Studies in the 
mathematical theory of inventory and production 10 201-209. 

Shapiro, A. 2006. On complexity of multistage stochastic programs. Operations Research Letters 34(1) 1-8. 





Shapiro, A., D. Dentcheva, A. Ruszczyński. 2009. Lectures on Stochastic Programming: Modeling and Theory,
vol. 9. SIAM.

Soyster, A. L. 1973. Convex programming with set-inclusive constraints and applications to inexact linear 
programming. Operations Research 21(5) 1154-1157. 

Wang, W., S. Ahmed. 2008. Sample average approximation of expected value constrained stochastic programs.
Operations Research Letters 36(5) 515-519.

Wang, Z., P. Glynn, Y. Ye. 2013. Likelihood robust optimization for data-driven problems. Working paper. 

Wen, Z., X. Peng, X. Liu, X. Sun, X. Bai. 2013. Asset allocation under the basel accord risk measures. 
Working paper. 

Wiesemann, W., D. Kuhn, M. Sim. 2014. Distributionally robust convex optimization. Operations Research 


Xu, H., D. Zhang. 2009. Smooth sample average approximation of stationary points in nonsmooth stochastic 
optimization and applications. Mathematical Programming 119(2) 371-401. 

Zhu, S., M. Fukushima. 2009. Worst-case conditional value-at-risk with application to robust portfolio 
management. Operations Research 57(5) 1155-1168. 



